Goto

Collaborating Authors

 inductive learning


Data Quality in Imitation Learning

Neural Information Processing Systems

In supervised learning, the question of data quality and curation has been overshadowed in recent years by increasingly more powerful and expressive models that can ingest internet-scale data.


A Appendix A531A.1 Detailed explanation of continuous nature of similarity

Neural Information Processing Systems

In this section, we expand on our observation that similarity between training samples is not binary. Consider the images shown in Figure 6. As a consequence, any similarity between the anchor image and the so-called'negative' examples is completely ignored. Further, all'positive' examples are considered to be The batch size is set to 16000. We train on 4 A100 GPUs.



Label Poisoning is All You Need

Neural Information Processing Systems

In a backdoor attack, an adversary injects corrupted data into a model's training dataset in order to gain control over its predictions on images with a specific attacker-defined trigger. A typical corrupted training example requires altering both the image, by applying the trigger, and the label. Models trained on clean images, therefore, were considered safe from backdoor attacks. However, in some common machine learning scenarios, the training labels are provided by potentially malicious third-parties. This includes crowd-sourced annotation and knowledge distillation. We, hence, investigate a fundamental question: can we launch a successful backdoor attack by only corrupting labels?




Neural Priming for Sample-Efficient Adaptation Matthew Wallingford Vivek Ramanujan Alex Fang Aditya Kusupati

Neural Information Processing Systems

Presented with class names or unlabeled test samples, Neural Priming enables the model to recall and conditions its parameters on relevant data seen throughout pretraining, thereby priming it for the test distribution. Neural Priming can be performed at inference, even for pretraining datasets as large as LAION-2B. Performing lightweight updates on the recalled data significantly improves accuracy across a variety of distribution shift and transfer learning benchmarks.